A Fast Algorithm for Finding Correlation Clusters in Noise Data
Authors
Abstract
Noise significantly degrades cluster quality. Conventional clustering methods can hardly detect clusters in a data set that contains a large amount of noise. Projected clustering offers a way to identify correlation clusters in such a data set. To exclude noise points, which are usually scattered in a subspace, data points are projected so that they form dense areas in the subspace, and these dense areas are regarded as correlation clusters. However, we found that existing projected clustering methods do not work well with noisy data, because they rely on randomly generated seeds (micro clusters), which trade off clustering quality. In this paper, we propose a divisive method for projected clustering that does not rely on random seeds. The proposed algorithm produces higher-quality correlation clusters from noisy data, and does so more efficiently, than an agglomerative projected clustering algorithm. We show experimentally that our algorithm captures correlation clusters in noisy data better than a well-known projected clustering
Similar resources
Acoustic correlated sources direction finding in the presence of unknown spatial correlation noise
In this paper, a new method is proposed for DOA estimation of correlated acoustic signals in the presence of unknown spatial correlation noise. A matrix is generated from the signal subspace with the Hankel-SVD method, and the correlated source information is extracted from each eigenvector. Then a joint-diagonalization structure is constructed of the signal subspace and basis it, independent...
A Review of the Problems of the DBSCAN Clustering Algorithm and the Improvements Proposed for It
Clustering is an important knowledge discovery technique in databases. Density-based clustering algorithms are among the main methods for clustering in data mining. These algorithms have some special features, including independence from the shape of the clusters, high understandability, and ease of use. DBSCAN is a base algorithm for density-based clustering algorithms. DBSCAN is able to...
A Multi-Objective Approach to Fuzzy Clustering using ITLBO Algorithm
Data clustering is one of the most important areas of research in data mining and knowledge discovery. Recent research in this area has shown that the best clustering results can be achieved using multi-objective methods. In other words, assuming more than one criterion as objective functions for clustering data can measurably increase the quality of clustering. In this study, a model with two ...
Improvement of density-based clustering algorithm using modifying the density definitions and input parameter
Clustering, which means grouping similar samples, is one of the main tasks in data mining. In general, there is a wide variety of clustering algorithms. One category is density-based clustering. Various algorithms have been proposed for this category; one of the most widely used is DBSCAN. DBSCAN can identify clusters of different shapes in the dataset and automatically i...
Adaptive-Filtering-Based Algorithm for Impulsive Noise Cancellation from ECG Signal
Suppression of noise and artifacts is a necessary step in biomedical data processing. Adaptive filtering is known as a useful method for this problem. Among various contaminants, some, such as the electrical activity of muscles, contribute to impulsive noise. This paper deals with modeling real-life muscle noise with an α-stable probability distribution and adaptive filterin...